Fix streaming tool call merge for Qwen and OpenAI-compatible APIs #4794
Conversation
- Update MessageAggregator to handle tool calls without IDs
- When tool call has no ID, merge with last tool call
- Add comprehensive tests for streaming patterns

Signed-off-by: ultramancode <[email protected]>
bf2c9ce to d2057f3 (force-pushed)

Signed-off-by: ultramancode <[email protected]>
@ultramancode Thanks for the PR! A few questions to understand the fix:
Thanks
@ilayaperumalg Thanks for the detailed questions!
Evidence from issue #4790. The symptom: incomplete chunks reached MessageAggregator as separate ChatResponses, causing the error reported there. This proves windowing failed.

Why windowing fails with Qwen: the windowing check is

`return choice.finishReason() == ChatCompletionFinishReason.TOOL_CALLS;`

Qwen's streaming doesn't send the expected `TOOL_CALLS` finish reason, so the window never closes. Related ecosystem issue:
Interestingly, Qwen's official spec documents this streaming pattern. So: windowing fails → incomplete chunks leak through → MessageAggregator receives them.

Why fix in MessageAggregator instead of OpenAiApi windowing?

Windowing approach (problematic):
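To make the failure mode concrete, here is a minimal sketch of the finish-reason check quoted above. All names (`WindowingSketch`, `Choice`, the enum) are hypothetical simplifications, not Spring AI's actual `OpenAiApi` internals:

```java
public class WindowingSketch {
    enum FinishReason { STOP, TOOL_CALLS, NONE }
    record Choice(FinishReason finishReason) {}

    // Hypothetical stand-in for the windowing predicate: chunks are
    // buffered until a choice reports TOOL_CALLS, then emitted as one unit.
    static boolean isToolCallBoundary(Choice choice) {
        return choice.finishReason() == FinishReason.TOOL_CALLS;
    }

    public static void main(String[] args) {
        // Spec-compliant stream: the final chunk signals TOOL_CALLS, the window closes.
        System.out.println(isToolCallBoundary(new Choice(FinishReason.TOOL_CALLS))); // true
        // A stream that never sends TOOL_CALLS: the window never closes, and
        // incomplete fragments leak downstream to MessageAggregator.
        System.out.println(isToolCallBoundary(new Choice(FinishReason.NONE))); // false
    }
}
```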
MessageAggregator approach (safe):
Example scenarios: Normal OpenAI (windowing works):
Buggy Qwen (windowing fails):
-> MessageAggregator acts as a safety net without needing to know whether upstream processing succeeded or failed.

A real-world concern: integrating with existing LLM infrastructure. This isn't about supporting "broken" APIs; it's about bridging the gap between specification and implementation in real-world enterprise environments. A scenario I'm concerned about (based on my experience): I recently worked on an AI agent solution for a client where:
The constraint:
While I didn't use Spring AI for that project, I'm concerned that Spring AI users will face the same situation: needing to integrate with pre-existing LLM infrastructure that has implementation gaps, with no ability to fix upstream components. Thanks!
Fixes #4790
Problem
When using OpenAI-compatible APIs like Qwen with streaming tool calls, subsequent chunks may not include the tool call ID. The current `MessageAggregator` uses `addAll()`, which creates separate, incomplete ToolCall objects for each chunk instead of merging them. This results in ToolCall objects with empty `name` fields, causing: `IllegalArgumentException: toolName cannot be null or empty`

Root Cause
Some OpenAI-compatible APIs (e.g., Qwen via OpenRouter) follow a streaming pattern where:
- First chunk: `id` and `function.name`
- Subsequent chunks: `function.arguments` without `id`

Example:
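The pattern can be sketched as follows. This is a hypothetical, simplified model: `ToolCallDelta`, its fields, and the sample values are illustrative, not Spring AI's actual types:

```java
import java.util.ArrayList;
import java.util.List;

public class QwenStreamPatternSketch {
    // Hypothetical, simplified view of one streamed tool-call delta.
    record ToolCallDelta(String id, String name, String arguments) {}

    public static void main(String[] args) {
        // Chunk 1 carries id and function.name (arguments may be partial).
        ToolCallDelta first = new ToolCallDelta("call_abc123", "get_weather", "{\"city\":");
        // Chunk 2 carries only function.arguments: no id, no name.
        ToolCallDelta second = new ToolCallDelta(null, null, "\"Seoul\"}");

        // Naive addAll()-style aggregation keeps both fragments as separate
        // tool calls; the second has no name, which later fails with
        // "toolName cannot be null or empty".
        List<ToolCallDelta> aggregated = new ArrayList<>();
        aggregated.addAll(List.of(first, second));
        System.out.println(aggregated.size());        // 2 (should be 1 merged call)
        System.out.println(aggregated.get(1).name()); // null
    }
}
```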
Solution
Added `mergeToolCalls()` method in `MessageAggregator` as a safety net to handle tool call fragments that may not be properly merged at the API layer (e.g., `OpenAiStreamFunctionCallingHelper`). This ensures that even when API-layer merging is incomplete or providers behave slightly differently, the aggregation layer can properly merge streaming tool call fragments.
This handles:
Changes
- Replaced `addAll()` with new `mergeToolCalls()` method to properly handle streaming tool call fragments
- Added `mergeToolCall()` helper method for null-safe property merging
- Added tests to `MessageAggregatorTests`:
  - `shouldMergeToolCallsWithoutIds`: verifies Qwen streaming pattern
  - `shouldMergeMultipleToolCallsWithMixedIds`: multiple tool calls
  - `shouldMergeToolCallsById`: ID-based matching still works

Testing
All tests pass with actual Qwen streaming response pattern verified via OpenRouter API.
Example:
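A minimal sketch of the merge behavior, under the assumption stated above that a fragment lacking an `id` belongs to the most recent tool call. `ToolCall` and `mergeToolCalls` here are simplified stand-ins, not the actual Spring AI implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class ToolCallMergeSketch {
    // Hypothetical simplified stand-in for Spring AI's ToolCall.
    record ToolCall(String id, String name, String arguments) {}

    // Fragments without an id are merged into the previous tool call
    // (null-safe: keep existing non-null fields, concatenate arguments).
    static List<ToolCall> mergeToolCalls(List<ToolCall> fragments) {
        List<ToolCall> merged = new ArrayList<>();
        for (ToolCall fragment : fragments) {
            boolean hasId = fragment.id() != null && !fragment.id().isEmpty();
            if (!hasId && !merged.isEmpty()) {
                ToolCall last = merged.remove(merged.size() - 1);
                merged.add(new ToolCall(
                        last.id(),
                        last.name() != null ? last.name() : fragment.name(),
                        (last.arguments() == null ? "" : last.arguments())
                                + (fragment.arguments() == null ? "" : fragment.arguments())));
            } else {
                merged.add(fragment);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Qwen-style stream: first chunk has id + name, later chunks only arguments.
        List<ToolCall> fragments = List.of(
                new ToolCall("call_1", "get_weather", "{\"city\":"),
                new ToolCall(null, null, "\"Seoul\"}"));
        List<ToolCall> merged = mergeToolCalls(fragments);
        System.out.println(merged.size());             // 1
        System.out.println(merged.get(0).name());      // get_weather
        System.out.println(merged.get(0).arguments()); // {"city":"Seoul"}
    }
}
```

Fragments that do carry an `id` still start a new tool call, so ID-based matching keeps working for spec-compliant providers.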